204   76 Box St Townsville NSW
205   231 Box Road Townsville QLD
206   53 3rd Ave, Townsville 4321 QUD
207   35 Third Avenue, [illegible]
208   333 Mt Pleasant Road, Springvale
209   191 springvale Road, Mt Pleasant
210   123 Sydney Ave, Melbourne VIC



Figure 2

[Drawing not reproduced. Reference numerals: 301, 306, 313]



Figure 3

Address, shown with three alternative field layouts:
    Address 1 | Address 2 | Address 3 | Address 4 | Address 5
    Address 1 | Address 2 | Suburb | State | Postcode
    Unit# | Street Name | Town | State | Postcode
    House #



Figure 4

[Drawing not reproduced. Reference numeral: 504. Surviving label: "procedure"]



Figure 5

[Parse tree, reference numeral 602. Input tokens: "Mr", "Fred", "and", "Mrs", "Mary", "Smith". Surviving node labels: Given Name, Surname, Word]



Figure 6

Figure 7

[Drawing not reproduced. Surviving text: "aa aaai bb bbb bb *ccc cc". Reference numerals: 804, 805, 806]




Figure 8 



[Drawing not reproduced. Surviving labels: "Address"; "String to be replaced"]



Figure 9

[Parse tree. Surviving node labels: Street, Level, Level Type, Level Number, Street Number, Number, Street Name, Word, Street Type. Leaf text: "Level", "38", "Mayne Street". Label: "Replacement String"]



Figure 10

Figure 11

[Flowchart. Reference numerals: 1201, 1203, 1205, 1206, 1207, 1208, 1209, 1210. Box text in reading order:]
    For each pair of sub-components with same type
    Perform String Comparison
    Call this procedure to get Match Confidence
    For each sub-component
    Record best match confidence
    Multiply best confidence level & matching weight
    Sum all resulting node values
    Divide total value by sum of all sub-component match weightings
    Return Matching Confidence
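The flowchart above can be read as a recursive weighted average. The following is a minimal Python sketch under stated assumptions, not the patented implementation: the `Node` class, the 0 to 100 confidence scale, and the use of `difflib.SequenceMatcher` as the "string comparison" step are all illustrative choices that do not appear in the drawing. It also assumes that the "for each sub-component" loop runs over the sub-components of the first operand.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class Node:
    """Hypothetical parse-tree node: a typed component with a matching weight."""
    kind: str
    weight: float = 1.0
    text: str = ""
    children: list = field(default_factory=list)


def match_confidence(a: Node, b: Node) -> float:
    """Return a 0..100 confidence that components a and b match."""
    # Leaf level: plain string comparison (difflib is only a stand-in here).
    if not a.children or not b.children:
        return 100.0 * SequenceMatcher(None, a.text.lower(), b.text.lower()).ratio()

    total = 0.0
    weight_sum = 0.0
    for sub_a in a.children:
        # Compare against every sub-component of the same type and
        # record the best confidence found (0 if there is no counterpart).
        best = max(
            (match_confidence(sub_a, sub_b)
             for sub_b in b.children if sub_b.kind == sub_a.kind),
            default=0.0,
        )
        total += best * sub_a.weight       # multiply by matching weight
        weight_sum += sub_a.weight
    # Divide the summed node values by the sum of all match weightings.
    return total / weight_sum if weight_sum else 0.0


def address(name: str, street_type: str) -> Node:
    """Build a small example tree; the weights are illustrative."""
    return Node("Address", children=[
        Node("StreetNumber", 30, "14"),
        Node("StreetName", 60, name),
        Node("StreetType", 10, street_type),
        Node("Town", 40, "Townsville"),
    ])


print(match_confidence(address("Kathryn", "St"), address("Kathryn", "St")))  # 100.0
```

Because each level normalises by its own weight total, a nested component contributes a value on the same 0 to 100 scale as a leaf, which is what lets the procedure call itself.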



Figure 12

[Two parse trees compared node by node. Reference numerals: 1302-1306, 1310, 1313-1321. Node labels: Address, Street, Street Number (Number), Street Name (Word), Street Type, Town (Word)]

First string:    12    14 Kathryn St Townsville
                 (20*100) (0*0) (10*0) (00*80) (10*100) (40*100)

Second string:   14 Catherine Street Townsville
                 (30*100) (60*80) (10*100) (40*100)
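The bracketed pairs printed under the second string can be turned into a single score with a few lines of Python. This is a hedged sketch: it assumes that each pair reads (matching weight * best confidence), which agrees with the "multiply, sum, divide by sum of weightings" steps of the Figure 12 flowchart but is not stated in the drawing itself. The function name `weighted_confidence` is hypothetical.

```python
import re


def weighted_confidence(annotation: str) -> float:
    """Weighted average of '(weight*confidence)' pairs as printed in the figure."""
    pairs = [(int(w), int(c)) for w, c in re.findall(r"\((\d+)\*(\d+)\)", annotation)]
    weight_sum = sum(w for w, _ in pairs)
    if weight_sum == 0:
        raise ValueError("no non-zero weights found")
    return sum(w * c for w, c in pairs) / weight_sum


# (30*100 + 60*80 + 10*100 + 40*100) / (30 + 60 + 10 + 40) = 12800 / 140
print(round(weighted_confidence("(30*100) (60*80) (10*100) (40*100)"), 2))  # 91.43
```

Under that reading, the only component that lowers the score is the street name, matched at 80 with the largest weight (60).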



Figure 13

[Flowchart. Reference numerals: 1401-1405, 1407, 1408, 1410-1416; branch labels Yes/No. Box text in reading order:]
    Relocate Text Data + free space
    Calculate Text Object Space Requirements
    Relocate Text Object + free space
    Shift "after" string by Diff positions
    For each node after
    Add Diff to start address
    For each other node which contains new [remainder illegible]
    Adjust node's length by Diff
    Adjust Text Object free space by Diff
    Set Error Condition
    Return
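The offset bookkeeping in the flowchart above can be sketched in Python as follows. This is an illustration under assumptions, not the patented routine: `Node` and `replace_node_text` are hypothetical names, nodes are modelled as (start, length) spans over one string, and the free-space, relocation and error branches are omitted because Python strings are reallocated automatically.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical parse-tree node addressing a span of the text data."""
    name: str
    start: int
    length: int


def replace_node_text(text: str, nodes: list, target: Node, replacement: str) -> str:
    """Replace the span covered by `target` and keep all node spans consistent."""
    old_start = target.start
    old_end = target.start + target.length
    diff = len(replacement) - target.length

    # Splicing the string shifts the "after" part by diff positions.
    new_text = text[:old_start] + replacement + text[old_end:]

    for node in nodes:
        if node is target:
            node.length += diff
        elif node.start >= old_end:
            # Node lies entirely after the replaced span: move its start.
            node.start += diff
        elif node.start <= old_start and node.start + node.length >= old_end:
            # Node contains the replaced span: adjust its length.
            node.length += diff
    return new_text


text = "Unit 5 123 Cathy St"
nodes = [
    Node("Address", 0, 19),
    Node("Unit", 0, 6),
    Node("StreetNumber", 7, 3),
    Node("StreetName", 11, 5),
    Node("StreetType", 17, 2),
]
street_name, street_type = nodes[3], nodes[4]
text = replace_node_text(text, nodes, street_name, "Catherine")
print(text)                                                        # Unit 5 123 Catherine St
print(text[street_type.start:street_type.start + street_type.length])  # St
```

Nodes that end before the replaced span (here `Unit` and `StreetNumber`) are left untouched, which matches the flowchart's restriction to nodes "after" the string and nodes that contain it.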



Figure 14

[Block diagram. Reference numerals: 1501-1507. Labels: Domain; Symbol Table; Character Definition; Lexicon; Parse Table; Regular Expression; Dictionary]



Figure 15 



[Data-flow diagram. Reference numerals: 1601-1608. Labels: Character Definition; Regular Expression; Grammar; Free-format Data; Attribute Type Name; Construct Domain Process; Domain Object; Save & Load; Domain Object; Text Object]



Figure 16

Standard Japanese Katakana Transliteration

[The katakana glyphs did not survive extraction; only the romanized column is shown. "--" marks an entry that is missing or unreadable.]

a    --   u    --   --
ka   ki   ku   ke   --
--   si   su   --   so
ta   ti   tu   te   to
na   ni   nu   ne   --
ha   hi   --   he   ho
ma   mi   mu   me   mo
ya   --   --   --   yo
ra   ri   ru   re   ro
wa   wi   --   --   --
ga   gi   gu   --   go
za   zi   zu   ze   zo
da   di   du   de   do
ba   bi   bu   be   bo
pa   pi   pu   pe   po
n
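A table of this kind is typically applied as a character-by-character lookup. The sketch below is illustrative only: the `KATAKANA` dictionary holds a small subset of standard katakana characters paired with the romanizations that survive in the table (note the `si`, `ti`, `tu` spellings), and the function name `transliterate` is hypothetical. Characters without an entry are passed through unchanged, which is an assumption rather than something the drawing specifies.

```python
# Subset of the katakana syllabary, romanized as in the table above.
KATAKANA = {
    "ア": "a", "ウ": "u",
    "カ": "ka", "キ": "ki", "ケ": "ke",
    "シ": "si", "ス": "su", "ソ": "so",
    "タ": "ta", "チ": "ti", "ツ": "tu", "テ": "te", "ト": "to",
    "ナ": "na", "ニ": "ni", "ヌ": "nu", "ネ": "ne",
    "ン": "n",
}


def transliterate(text: str, table: dict = KATAKANA) -> str:
    """Map each character through the table; unknown characters pass through."""
    return "".join(table.get(ch, ch) for ch in text)


print(transliterate("タナカ"))  # tanaka
print(transliterate("スシ"))    # susi
```

The same function works for any one-to-one character table, so a Greek table could be supplied through the `table` argument.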











Standard Greek Transliteration

Α α  a      Ι ι  i      Ρ ρ  r
Β β  v      Κ κ  k      Σ σ  s
Γ γ  g      Λ λ  l      Τ τ  t
Δ δ  d      Μ μ  m      Υ υ  u
Ε ε  e      Ν ν  n      Φ φ  f
Ζ ζ  z      Ξ ξ  x      Χ χ  ch
Η η  i      Ο ο  o      Ψ ψ  ps
Θ θ  th     Π π  p      Ω ω  o

Sample Regular Expression Definition







State  Description      Action**  alpha  digit  symbol  space  end of line  end of string
0      Error            0
1      start            1         3      9      13      1      16           18
2      empty            4
3      initial          2         6      5      5       4      5            5
4      Initial space+   3         5      5      5       4      5            5
5      Initial ^space   5
6      Word+            2         6      8      8       7      8            8
7      Word+ space      3         8      8      8       7      8            8
8      Word+ ^space     7
9      0-9              2         12     10     12      11     12           12
10     0-9+             2         12     10     12      11     12           12
11     0-9+ space       3         12     12     12      11     12           12
12     0-9+ ^space      8
13     sym              2         15     15     15      14     15           15
14     sym space        3         15     15     15      14     15           15
15     sym ^space       10
16     eol+ space*      3         17     17     17      16     16           18
17     eol+ ^space      11
18     eoi              9         0      0      0       0      0            0

** Action   Action Description
0    Error in Table
1    Bypass leading spaces
2    Append this character to character buffer
3    Append trailing space to character buffer
4    Empty string
5    Create "initial" token; go back 1 char; set state to 1
7    Create "word" token; go back 1 char; set state to 1
8    Create "number" token; go back 1 char; set state to 1
9    Create "end of input" token; go back 1 char; set state to 1
10   Create "symbol" token; go back 1 char; set state to 1
11   Create "end of line" token; go back 1 char; set state to 1
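A state table like the one above can drive a small tokenizer directly. The Python sketch below transcribes the transition and action columns as they are printed and interprets them under a few assumptions that the drawing does not spell out: the action of a state is performed on entering it, "go back 1 char" means the current character is re-read from state 1, and trailing spaces collected by action 3 are stripped from the token text. State 2 and action 4 (empty string) are never reached by the printed transitions and are left out. All names (`TRANSITIONS`, `ACTIONS`, `tokenize`, `classify`) are hypothetical.

```python
ALPHA, DIGIT, SYMBOL, SPACE, EOL, EOS = range(6)

# state -> next state for (alpha, digit, symbol, space, end of line, end of string)
TRANSITIONS = {
    1: (3, 9, 13, 1, 16, 18),
    3: (6, 5, 5, 4, 5, 5),
    4: (5, 5, 5, 4, 5, 5),
    6: (6, 8, 8, 7, 8, 8),
    7: (8, 8, 8, 7, 8, 8),
    9: (12, 10, 12, 11, 12, 12),
    10: (12, 10, 12, 11, 12, 12),
    11: (12, 12, 12, 11, 12, 12),
    13: (15, 15, 15, 14, 15, 15),
    14: (15, 15, 15, 14, 15, 15),
    16: (17, 17, 17, 16, 16, 18),
}

# state -> action number performed on entering that state
ACTIONS = {1: 1, 3: 2, 4: 3, 5: 5, 6: 2, 7: 3, 8: 7, 9: 2, 10: 2, 11: 3,
           12: 8, 13: 2, 14: 3, 15: 10, 16: 3, 17: 11, 18: 9}

# token-creating actions and the token type they produce
TOKEN_FOR_ACTION = {5: "initial", 7: "word", 8: "number",
                    10: "symbol", 11: "end of line"}


def classify(ch: str) -> int:
    """Map a character to one of the table's input classes."""
    if ch == "\n":
        return EOL
    if ch.isspace():
        return SPACE
    if ch.isalpha():
        return ALPHA
    if ch.isdigit():
        return DIGIT
    return SYMBOL


def tokenize(text: str) -> list:
    tokens, buffer, state, pos = [], [], 1, 0
    while True:
        ch = text[pos] if pos < len(text) else ""
        cls = classify(ch) if ch else EOS
        state = TRANSITIONS[state][cls]
        action = ACTIONS[state]
        if action == 9:                       # end of input
            tokens.append(("end of input", ""))
            return tokens
        if action in TOKEN_FOR_ACTION:        # emit token, re-read this char
            tokens.append((TOKEN_FOR_ACTION[action], "".join(buffer).strip()))
            buffer, state = [], 1
            continue
        if action in (2, 3):                  # collect character or trailing space
            buffer.append(ch)
        pos += 1                              # action 1 just skips the space


print(tokenize("14 Kathryn St"))
# [('number', '14'), ('word', 'Kathryn'), ('word', 'St'), ('end of input', '')]
```

A single letter followed by a space, as in "J Smith", is reported as an "initial" token rather than a "word", which is the distinction the separate states 3 to 5 exist to make.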



Figure 18

address     -> StreetAddr, Town Zipcode State
             | PostBox, Town Zipcode State ;
StreetAddr  -> Street
             | StreetNum Street
             | AptType AptNum StreetNum Street
             | StreetNum Street AptType AptNum ;
Street      -> StreetName StreetType StreetDir
             | StreetName StreetType ;
StreetName  -> word | word word ;
StreetNum   -> nbr ;
AptNum      -> nbr ;
StreetType  -> "Ave" | "Avenue" ("Ave")
             | "Rd" | "Road" ("Rd")
             | "St" | "Street" ("St") ;
StreetDir   -> Geo ;
Geo         -> "North" | "N" ("North")
             | "South" | "S" ("South")
             | "East" | "E" ("East")
             | "West" | "W" ("West") ;
AptType     -> "Apt" | "Apartment"
             | "Unit"
             | "Suite" | "Ste" ;
Zipcode     -> 99999 | 99999 "-" 9999 ;
PostBox     -> PostPref PostBoxNum ;
PostPref    -> "PO Box" | "Box" ;
PostBoxNum  -> nbr | nbr A | A nbr ;
Town        -> word | word word
             | Geo word | Geo word word ;
State       -> "ALABAMA" ("AL") | "AL"
             | "ALASKA" ("AK") | "AK"
             | "ARIZONA" ("AZ") | "AZ"
             | "ARKANSAS" ("AR") | "AR"
             | "CALIFORNIA" ("CA") | "CA"

[Unplaced fragment from the drawing: :-2]

Special Symbols:

->                   Consists of
|                    Or
;                    Rule terminator
( )                  Matching equivalence
=+-9=                Matching Significance Adjustment
[symbol illegible]   Zero Matching Significance
:+9                  Parsing Significance Adjustment

Reserved Words:

word                 one or more letters
nbr                  one or more digits
A                    one letter
9                    one digit
[symbol illegible]   comma or nothing
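The "matching equivalence" notation, for example `"Avenue" ("Ave")`, says that two spellings should be treated as the same value when matching. A minimal Python sketch of the `Street` rule with that equivalence applied is given below. It is an illustration, not the parser the drawings describe: the lookup tables are copied from the grammar above, while `parse_street` and its dictionary result are hypothetical.

```python
# Terminal -> canonical form, taken from the StreetType and Geo rules.
STREET_TYPES = {"Ave": "Ave", "Avenue": "Ave",
                "Rd": "Rd", "Road": "Rd",
                "St": "St", "Street": "St"}
GEO = {"North": "North", "N": "North", "South": "South", "S": "South",
       "East": "East", "E": "East", "West": "West", "W": "West"}


def parse_street(tokens: list):
    """Street -> StreetName StreetType StreetDir | StreetName StreetType."""
    for name_len in (1, 2):                 # StreetName -> word | word word
        name, rest = tokens[:name_len], tokens[name_len:]
        if len(name) < name_len or not all(t.isalpha() for t in name):
            continue
        if not rest or rest[0] not in STREET_TYPES:
            continue
        result = {"StreetName": " ".join(name),
                  "StreetType": STREET_TYPES[rest[0]]}   # canonical form
        if len(rest) == 1:
            return result
        if len(rest) == 2 and rest[1] in GEO:
            result["StreetDir"] = GEO[rest[1]]
            return result
    return None                             # no alternative matched


print(parse_street("Catherine Street".split()))
# {'StreetName': 'Catherine', 'StreetType': 'St'}
print(parse_street("Mt Pleasant Road N".split()))
# {'StreetName': 'Mt Pleasant', 'StreetType': 'Rd', 'StreetDir': 'North'}
```

Storing the canonical form means "Kathryn St" and "Kathryn Street" carry the same `StreetType` value, so only the street name itself needs a fuzzy comparison.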





Figure 19

[Flowchart. Reference numerals: 2001, 2002, 2003, 2004, 2006, 2007, 2008, 2009, 2010; branch labels Yes/No. Box text in reading order:]
    START
    Load Character Definition Data
    Load Regular Expression Definition Data
    For each Rule in Grammar
    Process Grammar Rule
    For each symbol in Symbol table
    For each symbol in Symbol table
    Create Parse Table
    Set Error Condition
    Store Parse table reference in corresponding Symbol table
    Return



Figure 20

[Flowchart. Reference numerals: 2101, 2102, 2103, 2104, 2105, 2106, 2107, 2110. Box text in reading order:]
    Create new rule in Rule Table
    Add LHS symbol to Symbol Table
    Return
    For each symbol in RHS of rule
    Add entry to Dictionary
    Do nothing
    Add symbol to Symbol Table
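The decision text that selects between the three outcomes (add entry to Dictionary, do nothing, add symbol to Symbol Table) did not survive extraction, so the following Python sketch fills it in with assumptions: quoted terminals go to the dictionary, the reserved words of the grammar are skipped, and any other symbol is added to the symbol table as not yet defined. The names `process_rule`, `undefined_symbols` and `RESERVED_WORDS`, and the use of a boolean "defined" flag, are hypothetical.

```python
RESERVED_WORDS = {"word", "nbr", "A", "9"}   # built-in token classes


def process_rule(lhs, rhs, rule_table, symbol_table, dictionary):
    """Record one grammar rule and classify each right-hand-side symbol."""
    rule_table.append((lhs, tuple(rhs)))     # create new rule in Rule Table
    symbol_table[lhs] = True                 # LHS symbol is now defined
    for symbol in rhs:
        if symbol.startswith('"'):           # literal text: dictionary entry
            dictionary.add(symbol.strip('"'))
        elif symbol in RESERVED_WORDS:       # built-in: do nothing
            continue
        else:                                # non-terminal: note it, keep status
            symbol_table.setdefault(symbol, False)


def undefined_symbols(symbol_table):
    """Symbols used on some RHS but never defined by a rule."""
    return sorted(name for name, defined in symbol_table.items() if not defined)


rules, symbols, words = [], {}, set()
process_rule("StreetType", ['"Ave"', '"Avenue"'], rules, symbols, words)
process_rule("Street", ["StreetName", "StreetType"], rules, symbols, words)
print(undefined_symbols(symbols))   # ['StreetName']
process_rule("StreetName", ["word"], rules, symbols, words)
print(undefined_symbols(symbols))   # []
```

A check like `undefined_symbols` is one plausible reading of the per-symbol test that leads to "Set Error Condition" in the Figure 20 flowchart; the drawing does not confirm it.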



Figure 21

SQL Database Implementation Example

1. CREATE DOMAINOBJECT US_ADDRESS ;

2. UPDATE US_ADDRESS
       SET LANGUAGE = EXTERNAL 'Path/English.txt',
           GRAMMAR = EXTERNAL 'Path/US_Addr.txt' ;

3. CREATE TEXTOBJECT ADDRESS ;

4. UPDATE US_ADDRESS
       SET DOMAIN = US_ADDRESS,
           TYPE = "Address" ;

5. CREATE TABLE PERSONS (
       Name CHAR (20),
       Home_Addr ADDRESS ) ;

6. INSERT INTO PERSONS ( Name, Home_Addr )
       VALUES (
           "John Smith",
           "123 Cathy Street, Apt 5, Huntsvale, CA, 98765" ) ;

7. SELECT FROM PERSONS
       WHERE Home_Addr = "Unit 5 123 Cathy St, Huntsvale, CA" ;

8. SELECT FROM PERSONS
       WHERE Home_Addr.State = "California" ;

9. SELECT FROM PERSONS
       WHERE Home_Addr.Street MATCHES "Kathie St" > 0.80 ;

New Reserved Words:

    DOMAINOBJECT, TEXTOBJECT, LANGUAGE,
    GRAMMAR, TYPE, MATCHES



Figure 22 



